Skip to content

Conversation

@dricross
Copy link

@dricross dricross commented Oct 17, 2025

Description

Creating test cases for testing the classic histogram mapping algorithm. This includes both raw datasets (simple values) for testing accuracy and synthetic histograms for testing correctness.

I put this in a new Go package under pkg/aws as we want to use the mapping algorithm in both opentelemetry-collector-contrib and amazon-cloudwatch-agent.

Testing

New unit tests

Unit tests for receiver/elasticsearchreceiver are failing on GitHub runner (they pass locally):

=== FAIL: . TestIntegration/7.16.3 (83.92s)
    scraperint.go:113: 
        	Error Trace:	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/coreinternal/scraperinttest/scraperint.go:113
        	Error:      	Condition never satisfied
        	Test:       	TestIntegration/7.16.3
    scraperint.go:95: number of resources doesn't match expected: 4, actual: 0

These changes should have no impact on that receiver as this PR introduces and entirely new, unreferenced module (outside of version.yaml).

The same test is failing for aws-cwa-dev as well: https://github.com/amazon-contributing/opentelemetry-collector-contrib/actions/runs/18651721544

@dricross dricross marked this pull request as ready for review October 20, 2025 12:38
Paramadon
Paramadon previously approved these changes Oct 21, 2025
}

if dp.HasMax() {
if math.IsNaN(dp.Max()) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider making this a helper function to minimize duplication

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make a helper

"go.opentelemetry.io/collector/pdata/pmetric"
)

func TestCheckValidity(t *testing.T) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could add test cases for min/max

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, will add.

var datasets []GeneratedDataset
for _, config := range configs {
// each dataset gets a copy of the same rand so that they don't interfere with each other
rng := rand.New(rand.NewPCG(seed, seed))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much randomness do we need and how significant is it? what happens if each config shares the same rand?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we want is to be able to generate data that reasonably resembles some data distribution. We want that generated data to be reliably repeatable so that we can write integration tests around it. We could hard code all of the values, but we decided to use a set seed on a random number generator instead. If the configs share the same rand, then they will interfere with each other making the generated data not reliably repeatable.

Copy link

@movence movence Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, I'm not questioning why use seed but why each config would require a new rand. They are diff distributions already so having the same rand is fine?

}

func gammaRandom(shape float64, rng *rand.Rand) float64 {
if shape < 1 {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this written from scratch? some comments or a source will be helpful.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is. I'll add some comments


assertOptionalFloat(t, "min", tc.Expected.Min, tc.Input.Min)
assertOptionalFloat(t, "max", tc.Expected.Max, tc.Input.Max)
assert.InDelta(t, tc.Expected.Average, tc.Input.Sum/float64(tc.Input.Count), 0.01)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is delta calc conservative or more releaxed? just asking since we have seen some flakiness with tight delta calcs before

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more relaxed than direct floating point comparisons, but its more conservative than a % diff check (e.g. (expected-actual)/actual).

}

// Rest of checks only apply if we have boundaries/counts
if len(hi.Boundaries) > 0 || len(hi.Counts) > 0 {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe create local vars with lens of these since they are used a quite few in thie block

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

@dricross dricross merged commit c7dcf1f into aws-cwa-dev Oct 23, 2025
152 checks passed

package cloudwatch // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/aws/cloudwatch"

// HistogramDataPoint is a single data point that describes the values of a Histogram.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Block quotes are more readable

// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package histograms // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/aws/cloudwatch/histograms"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultra small Nit: excessive comments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants